AITopics | hellinger distance

Collaborating Authors

hellinger distance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Empirical Bayes 1-bit matrix completion

Matsuda, Takeru

arXiv.org Machine LearningMay-12-2026

Matrix completion is a fundamental problem in machine learning, where the objective is to recover missing entries of a partially observed matrix. A prominent example is the Netflix Prize (Bennett and Lanning, 2007), which involved predicting a matrix of movie ratings by users for recommendation purposes. Beyond recommendation, matrix completion has recently found applications in causal inference for panel data (Athey et al., 2021). A standard assumption in matrix completion is that the underlying matrix is approximately low-rank, reflecting a few latent factors that govern interactions between rows and columns. A substantial body of work has established theoretical guarantees and developed efficient algorithms for matrix completion (Cai, Cand`es and Shen, 2010; Cand`es and Recht, 2008; Keshavan, Montanari, and Oh, 2010; Mazumder, Hastie and Tibshirani, 2010; Recht, 2011), predominantly focusing on cases where the observed entries are continuous-valued. In many applications, however, observations are not continuous-valued but binary.

artificial intelligence, machine learning, matrix completion, (17 more...)

arXiv.org Machine Learning

2605.09509

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Sample Complexity of Forecast Aggregation

Neural Information Processing SystemsApr-28-2026, 14:13:10 GMT

We consider a Bayesian forecast aggregation model where nexperts, after observing private signals about an unknown binary event, report their posterior beliefs about the event to a principal, who then aggregates the reports into a single prediction for the event. The signals of the experts and the outcome of the event follow a joint distribution that is unknown to the principal, but the principal has access to i.i.d. "samples" from the distribution, where each sample is a tuple of the experts' reports (not signals) and the realization of the event. Using these samples, the principal aims to find an ε-approximately optimal aggregator, where optimality is measured in terms of the expected squared distance between the aggregated prediction and the realization of the event. We show that the sample complexity of this problem is at least Ω(mn 2/ε) for arbitrary discrete distributions, where m is the size of each expert's signal space. This sample complexity grows exponentially in the number of experts n. But, if the experts' signals are independent conditioned on the realization of the event, then the sample complexity is significantly reduced, to O(1/ε2), which does not depend on n. Our results can be generalized to non-binary events. The proof of our results uses a reduction from the distribution learning problem and reveals the fact that forecast aggregation is almost as difficult as distribution learning.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.67)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Neighbor Embedding for High-Dimensional Sparse Poisson Data

Mudrik, Noga, Charles, Adam S.

arXiv.org Machine LearningApr-21-2026

Across many scientific fields, measurements often represent the number of times an event occurs. For example, a document can be represented by word occurrence counts, neural activity by spike counts per time window, or online communication by daily email counts. These measurements yield high-dimensional count data that often approximate a Poisson distribution, frequently with low rates that produce substantial sparsity and complicate downstream analysis. A useful approach is to embed the data into a low-dimensional space that preserves meaningful structure, commonly termed dimensionality reduction. Yet existing dimensionality reduction methods, including both linear (e.g., PCA) and nonlinear approaches (e.g., t-SNE), often assume continuous Euclidean geometry, thereby misaligning with the discrete, sparse nature of low-rate count data. Here, we propose p-SNE (Poisson Stochastic Neighbor Embedding), a nonlinear neighbor embedding method designed around the Poisson structure of count data, using KL divergence between Poisson distributions to measure pairwise dissimilarity and Hellinger distance to optimize the embedding. We test p-SNE on synthetic Poisson data and demonstrate its ability to recover meaningful structure in real-world count datasets, including weekday patterns in email communication, research area clusters in OpenReview papers, and temporal drift and stimulus gradients in neural spike recordings.

artificial intelligence, embedding, machine learning, (15 more...)

arXiv.org Machine Learning

2604.16932

Country: North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Bayesian optimization for automated model selection

Gustavo Malkomes, Charles Schaff, Roman Garnett

Neural Information Processing SystemsMar-23-2026, 06:54:06 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, kernel, machine learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Locality-Sensitive Hashing for f-Divergences: Mutual Information Loss and Beyond

Lin Chen, Hossein Esfandiari, Gang Fu, Vahab Mirrokni

Neural Information Processing SystemsFeb-19-2026, 10:53:52 GMT

Neural Information Processing Systems http://nips.cc/

divergence, hash function, kernel, (11 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)
Information Technology > Artificial Intelligence > Natural Language (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Information-theoretic Limits of Online Classification with Noisy Labels

Neural Information Processing SystemsFeb-18-2026, 05:21:59 GMT

Learning from noisy data is a fundamental problem in many machine learning applications.

artificial intelligence, kernel, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Hawaii (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Additional definitions

Neural Information Processing SystemsFeb-16-2026, 08:40:16 GMT

We provide the definitions of important terms used throughout the paper. Assumption 2.3 when the demand distribution is exponential. Note that Lemma B.1 implies that In the following result, we show that there exist appropriate constants such that prior distribution satisfies Assumption 2.3 when the demand distribution is a multivariate Gaussian with unknown The proof is a direct consequence of Theorem 3.2, Lemmas B.6, B.7, B.8, B.9, and Proposition 3.2. Theorem 6.19] the prior induced by Assumption 2.2 is a direct consequence of Assumption 2.4 and 2.5 are straightforward to satisfy since the model risk function Lemma B.13. F or a given Using the result above together with Proposition 3.2 implies that the RSVB posterior converges at C.1 Alternative derivation of LCVB We present the alternative derivation of LCVB. We prove our main result after a series of important lemmas.

artificial intelligence, assumption 2, equation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.93)

Add feedback